Goto

Collaborating Authors

 Animation


Does Toy Story 5 Fumble Its Big Emotional Moment? Let's Break It Down

TIME - Tech

Follow this section to personalize your feed and get instant alerts. Follow Go to your personalized feed WHY FOLLOW? Smart Alerts: Get notified about major news as it happens. Follow this tag to personalize your feed and get instant alerts. Follow Go to your personalized feed WHY FOLLOW?



MoCha: Towards Movie-Grade Talking Character Generation

Neural Information Processing Systems

Recent advancements in video generation have achieved impressive motion realism, yet they often overlook character-driven storytelling, a crucial task for automated film, animation generation. We introduce Talking Characters, a more realistic task to generate talking character animations directly from speech and text.


EverybodyDance: Bipartite Graph-Based Identity Correspondence for Multi-Character Animation

Neural Information Processing Systems

Consistent pose-driven character animation has achieved remarkable progress in single-character scenarios. However, extending these advances to multi-character settings is non-trivial, especially when position swap is involved. Beyond mere scaling, the core challenge lies in enforcing correct Identity Correspondence (IC) between characters in reference and generated frames. To address this, we introduce EverybodyDance, a systematic solution targeting IC correctness in multi-character animation. EverybodyDance is built around the Identity Matching Graph (IMG), which models characters in the generated and reference frames as two node sets in a weighted complete bipartite graph.


๐ŸŽงMOSPA: Human Motion Generation Driven by Spatial Audio

Neural Information Processing Systems

Enabling virtual humans to dynamically and realistically respond to diverse auditory stimuli remains a key challenge in character animation, demanding the integration of perceptual modeling and motion synthesis. Despite its significance, this task remains largely unexplored. Most previous works have primarily focused on mapping modalities like speech, audio, and music to generate human motion. As of yet, these models typically overlook the impact of spatial features encoded in spatial audio signals on human motion. To bridge this gap and enable high-quality modeling of human movements in response to spatial audio, we introduce the first comprehensive Spatial Audio-Driven Human Motion (SAM) dataset, which contains diverse and high-quality spatial audio and motion data. For benchmarking, we develop a simple yet effective diffusion-based generative framework for human MOtion generation driven by SPatial Audio, termed MOSPA, which faithfully captures the relationship between body motion and spatial audio through an effective fusion mechanism. Once trained, MOSPA can generate diverse realistic human motions conditioned on varying spatial audio inputs. We perform a thorough investigation of the proposed dataset and conduct extensive experiments for benchmarking, where our method achieves state-of-the-art performance on this task.


6 science milestones turning 40 this year

Popular Science

In 1986, we had huge leaps forward, tragic steps back, and life changing innovations. NASA's STS-51L crew members pose for photographs during a break in countdown training at the White Room, Launch Complex 39, Pad B. Left to right are Teacher-in-Space payload specialist Sharon Christa McAuliffe; payload specialist Gregory Jarvis; and astronauts Judith A. Resnik, mission specialist; Francis R. (Dick) Scobee, mission commander; Ronald E. McNair, mission specialist; Mike J. Smith, pilot; and Ellison S. Onizuka, mission specialist. Breakthroughs, discoveries, and DIY tips sent every weekday. It was a year that saw roughly six million Americans hold hands in a continuous (more or less) line across the country to raise money for homelessness. A news anchor named Oprah Winfrey debuted her new talk show.


CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes

Neural Information Processing Systems

Causal video question answering (QA) has garnered increasing interest, yet existing datasets often lack depth in causal reasoning. To address this gap, we capitalize on the unique properties of cartoons and construct CausalChaos!, a novel, challenging causal Why-QA dataset built upon the iconic Tom and Jerry cartoon series. Cartoons use the principles of animation that allow animators to create expressive, unambiguous causal relationships between events to form a coherent storyline. Utilizing these properties, along with thought-provoking questions and multi-level answers (answer and detailed causal explanation), our questions involve causal chains that interconnect multiple dynamic interactions between characters and visual scenes. These factors demand models to solve more challenging, yet well-defined causal relationships. We also introduce hard incorrect answer mining, including a causally confusing version that is even more challenging. While models perform well, there is much room for improvement, especially, on open-ended answers. We identify more advanced/explicit causal relationship modeling \& joint modeling of vision and language as the immediate areas for future efforts to focus upon. Along with the other complementary datasets, our new challenging dataset will pave the way for these developments in the field.


The Truth About the em Avatar /em Movies That No One Wants to Accept

Slate

James Cameron is desperate to convince the world that these movies aren't "cartoons." Enter your email to receive alerts for this author. You can manage your newsletter subscriptions at any time. You're already subscribed to the aa_Sam_Adams newsletter. You can manage your newsletter subscriptions at any time.


PESTalk: Speech-Driven 3D Facial Animation with Personalized Emotional Styles

arXiv.org Artificial Intelligence

PESTalk is a novel method for generating 3D facial animations with personalized emotional styles directly from speech. It overcomes key limitations of existing approaches by introducing a Dual-Stream Emotion Extractor (DSEE) that captures both time and frequency-domain audio features for fine-grained emotion analysis, and an Emotional Style Modeling Module (ESMM) that models individual expression patterns based on voiceprint characteristics. To address data scarcity, the method leverages a newly constructed 3D-EmoStyle dataset. Evaluations demonstrate that PESTalk outperforms state-of-the-art methods in producing realistic and personalized facial animations.


AnimAgents: Coordinating Multi-Stage Animation Pre-Production with Human-Multi-Agent Collaboration

arXiv.org Artificial Intelligence

Animation pre-production lays the foundation of an animated film by transforming initial concepts into a coherent blueprint across interdependent stages such as ideation, scripting, design, and storyboarding. While generative AI tools are increasingly adopted in this process, they remain isolated, requiring creators to juggle multiple systems without integrated workflow support. Our formative study with 12 professional creative directors and independent animators revealed key challenges in their current practice: Creators must manually coordinate fragmented outputs, manage large volumes of information, and struggle to maintain continuity and creative control between stages. Based on the insights, we present AnimAgents, a human-multi-agent collaborative system that coordinates complex, multi-stage workflows through a core agent and specialized agents, supported by dedicated boards for the four major stages of pre-production. AnimAgents enables stage-aware orchestration, stage-specific output management, and element-level refinement, providing an end-to-end workflow tailored to professional practice. In a within-subjects summative study with 16 professional creators, AnimAgents significantly outperformed a strong single-agent baseline that equipped with advanced parallel image generation in coordination, consistency, information management, and overall satisfaction (p < .01). A field deployment with 4 creators further demonstrated AnimAgents' effectiveness in real-world projects.